Principal component analysis (PCA) is used here as a dimensionality reduction technique to find the overarching dimensions that represent knowledge about social relationships. In this study, we replicate a study conducted nearly 40 years ago (Wish et al., 1976), but with a comprehensive list of social relationships.
This dataset was collected from a survey hosted on Amazon Mechanical Turk (MTurk). The survey data were cleaned with a separate Python script. We then built a matrix holding the average rating of each social relationship on each of the dimensions thought to characterize these relationships. The dimensions used in this analysis are the same ones used by Wish et al. (1976). The list of relationships was generated with lexical word-vector tools and is intended to be comprehensive (159 relationships in total).
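As a rough illustration of the averaging step, the sketch below builds a relationship-by-dimension matrix of mean ratings from long-format records. It is not the cleaning script used for this study: the record layout, the function name `mean_rating_matrix`, and the ratings themselves are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format survey records: (relationship, dimension, rating).
# These values are invented for illustration; they are not the study's data.
responses = [
    ("siblings", "Friendly vs Hostile", 8),
    ("siblings", "Friendly vs Hostile", 6),
    ("siblings", "Equal vs Unequal", 7),
    ("boss and employee", "Friendly vs Hostile", 5),
    ("boss and employee", "Equal vs Unequal", 2),
    ("boss and employee", "Equal vs Unequal", 4),
]


def mean_rating_matrix(records):
    """Return (row labels, column labels, matrix of mean ratings).

    Assumes every relationship was rated at least once on every dimension.
    """
    cells = defaultdict(list)
    for relationship, dimension, rating in records:
        cells[(relationship, dimension)].append(rating)
    relationships = sorted({r for r, _ in cells})
    dimensions = sorted({d for _, d in cells})
    matrix = [[mean(cells[(r, d)]) for d in dimensions] for r in relationships]
    return relationships, dimensions, matrix


relationships, dimensions, matrix = mean_rating_matrix(responses)
for label, row in zip(relationships, matrix):
    print(label, row)
```

Each row of the resulting matrix is one relationship and each column one rating dimension, which is the shape PCA expects (observations by variables).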
‘All relationships rated on Wish dimensions’
PCA outputs as many components as there are input dimensions. Because the components are ranked by how much variance they explain, we can drop the later components, which add little additional information.
We use parallel analysis to estimate the optimal number of components to retain.
## Parallel analysis suggests that the number of factors = NA and the number of components = 3
Parallel analysis indicates that retaining only 3 components would be optimal. However, to better match the literature and to stay consistent with possible future analyses, we retain 4 components.
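The idea behind parallel analysis can be sketched in a few lines: keep a component only while its eigenvalue exceeds what uncorrelated random data of the same shape would produce. The code below is a minimal Python sketch under stated assumptions, not the routine that produced the output above. The function name `parallel_analysis`, the 95th-percentile benchmark, the number of simulations, and the toy data are all choices made for this example, and the R routine may use different defaults.

```python
import numpy as np


def parallel_analysis(data, n_iter=200, percentile=95, seed=0):
    """Count leading components whose eigenvalue beats a random-data benchmark."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Eigenvalues from uncorrelated normal data of the same shape.
    simulated = np.empty((n_iter, n_vars))
    for i in range(n_iter):
        noise = rng.standard_normal((n_obs, n_vars))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    exceeds = observed > threshold
    # Retain components up to (not including) the first one that fails.
    return n_vars if exceeds.all() else int(np.argmin(exceeds))


# Toy data (not the study's): 159 rows, 8 variables driven by 2 latent factors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((159, 2))
weights = np.zeros((2, 8))
weights[0, :4] = 1.0
weights[1, 4:] = 1.0
toy = latent @ weights + 0.5 * rng.standard_normal((159, 8))

n_components = parallel_analysis(toy)
print("Suggested number of components:", n_components)
```

The suggestion is a statistical guideline rather than a hard rule, which is why a fourth component can still be retained on theoretical grounds.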
We first run PCA without rotation to visualize the amount of variance accounted for by each component.
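The variance accounted for by each unrotated component follows directly from the eigenvalues of the correlation matrix. The sketch below assumes the PCA is run on standardized variables (that is, on the correlation matrix). The function name `explained_variance_ratio` and the toy data are illustrative only.

```python
import numpy as np


def explained_variance_ratio(data):
    """Proportion of total variance explained by each principal component."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # largest first
    return eigenvalues / eigenvalues.sum()


# Toy data (not the study's): 159 rows, 6 variables, two of them correlated.
rng = np.random.default_rng(0)
toy = rng.standard_normal((159, 6))
toy[:, 1] += 2 * toy[:, 0]

ratios = explained_variance_ratio(toy)
cumulative = np.cumsum(ratios)
for k, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{k}: {100 * r:.1f}% (cumulative {100 * c:.1f}%)")
```

Plotting `ratios` against the component number gives a scree plot, and `cumulative` gives the share of variance retained when keeping the first k components.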
Rotation is used in principal component analysis to make the components easier to interpret. There are two main families of rotation: orthogonal (e.g., varimax) and oblique (e.g., oblimin). Here we use varimax rotation, which maximizes the variance of the squared loadings within each component, so that each dimension tends to load strongly on a single component rather than moderately across several. Because varimax is an orthogonal rotation, the resulting components remain uncorrelated with each other. Oblimin, by contrast, is an oblique rotation that allows the components to correlate.
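To make the effect of varimax concrete, the sketch below implements the raw varimax criterion with Kaiser's pairwise planar rotations. It is an illustrative Python version, not the R routine used for the results that follow. It omits the row (Kaiser) normalization that many packages apply by default, and the function name `varimax` and the toy loadings are assumptions for the example.

```python
import numpy as np


def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw varimax via pairwise planar rotations (no Kaiser normalization).

    Returns (rotated loadings, orthogonal rotation matrix), where
    rotated = loadings @ rotation.
    """
    rotated = np.array(loadings, dtype=float)
    n_vars, n_comp = rotated.shape
    rotation = np.eye(n_comp)
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(n_comp - 1):
            for j in range(i + 1, n_comp):
                x, y = rotated[:, i], rotated[:, j]
                u, v = x**2 - y**2, 2 * x * y
                # Closed-form angle that maximizes the criterion in this plane.
                num = 2 * (u @ v) - 2 * u.sum() * v.sum() / n_vars
                den = (u @ u - v @ v) - (u.sum() ** 2 - v.sum() ** 2) / n_vars
                angle = 0.25 * np.arctan2(num, den)
                largest_angle = max(largest_angle, abs(angle))
                c, s = np.cos(angle), np.sin(angle)
                plane = np.array([[c, -s], [s, c]])
                rotated[:, [i, j]] = rotated[:, [i, j]] @ plane
                rotation[:, [i, j]] = rotation[:, [i, j]] @ plane
        if largest_angle < tol:
            break
    return rotated, rotation


# Toy loadings (made up): a clean two-component structure, blurred by 30 degrees.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.85, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.pi / 6
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each dimension's communality (its row sum of squared loadings) is unchanged. Only the way that variance is distributed across components differs.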
## [1] "First four components account for 88.09% of the variance"
## Component 1 highest positive loadings: Altruistic vs Selfish, Compatible vs incompatible goals and desires, Cooperative vs Competitive, Democratic vs Autocratic, Easy vs Difficult to resolve conflicts with each other, Emotionally close vs distant, Fair vs Unfair, Flexible vs Rigid, Friendly vs Hostile, Harmonious vs Clashing, Productive vs Destructive, Relaxed vs Tense, Sincere vs Insincere
##
## Component 1 highest negative loadings: (none)
##
## Component 2 highest positive loadings: Emotional vs Intellectual, Informal vs Formal, Intense vs Superficial feelings toward each other, Pleasure vs Work oriented
##
## Component 2 highest negative loadings: Important vs Unimportant to society
##
## Component 3 highest positive loadings: Active vs Inactive, Difficult vs Easy to break off contact with each other, Important vs Unimportant to individuals involved, Intense vs Superficial feelings toward each other, Intense vs Superficial interaction with each other, Interesting vs Dull
##
## Component 3 highest negative loadings: (none)
##
## Component 4 highest positive loadings: Democratic vs Autocratic, Equal vs Unequal, Similar vs Different roles and behavior
##
## Component 4 highest negative loadings: (none)
##
## Spearman's rank correlation rho
##
## data: wish_loadings$wish_dim2 and pv$loadings[, 4]
## S = 666.57, p-value = 2.045e-05
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.7436268
##
## Spearman's rank correlation rho
##
## data: study1_loadings$RC3 and pv$loadings[, 4]
## S = 122, p-value = 1.124e-06
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.9530769
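The rank correlations reported above compare two sets of loadings for the same rating dimensions, listed in the same order. Spearman's rho is the Pearson correlation of the ranks, as the minimal standard-library sketch below shows. The loading values are invented for illustration, the helper names are hypothetical, and the p-value computation is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up loadings of five dimensions on one component in two solutions.
previous = [0.82, 0.10, -0.35, 0.64, 0.05]
current = [0.78, 0.22, -0.41, 0.15, 0.02]
rho = spearman_rho(previous, current)
print(f"rho = {rho:.2f}")
```

A rank-based measure is a reasonable choice here because it asks only whether the dimensions are ordered similarly on the two components, not whether the loadings have the same magnitude.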
Wish, study 1, and study 2 comparison:

* Study 2 PC1 is most correlated with Wish dimension 1 (valence) and study 1 PC1 (valence) (rho = 0.97, p < .001)
* Study 2 PC2 is most correlated with Wish dimension 4 (formality) (rho = 0.81, p < .001) and study 1 PC2 (formality) (rho = 0.85, p < .001)
* Study 2 PC3 is most correlated with Wish dimension 3 (activeness/intensity) (rho = 0.91, p < .001) and study 1 PC4 (activeness) (rho = 0.77, p < .001)
* Study 2 PC4 is most correlated with Wish dimension 2 (equality) (rho = 0.74, p < .001) and study 1 PC3 (equality) (rho = 0.95, p < .001)
PC1 = Valence
PC2 = Formality
PC3 = Intensity (Activeness)
PC4 = Equality
Compared to study 1, we recover the same four components. One difference is that Activeness is now better described as Intensity. It is also the third component in study 2, whereas it was the fourth component in study 1. Conversely, Equality is the fourth component in study 2 but was the third component in study 1.